
Modular design of Web scraping robots

October 10, 2009

Thanks to IRobotSoft's innovative Web scraping technology, creating a Web robot has become as routine as creating a Word or Excel document. You can easily create robots to automate repetitive Web explorations, which makes access to Web information more efficient and less time consuming. Although robots are still not intelligent enough to replace all human Web exploration, they can certainly take over much of the work on Web interfaces that are data intensive or demand a lot of manual effort. Consider using Web robots to add new friends every day on your social Web accounts, to search for new friends who match your preferences, or to update your personal information on multiple social networks. There are many scenarios in which an automated Web robot can relieve you of a busy Web agenda.

Designing a Web robot can be easy or difficult, depending on the complexity of the work. Easy work, such as clicking a button every few seconds, needs little development: you can create a robot with a single action, without thinking much about its structure or logic, and without any programming skill. Complex work, however, does need some planning and a careful design of the robot architecture. You need to think about how to use variables to hold the extracted Web content, and how to break a big job down into smaller pieces so that you can tackle each piece and get it working with ease. For example, if you want to create a Web robot that automatically logs on to your Gmail account and checks for new emails or sends messages from it, you may want to break it down into three pieces: one to log on to the Gmail account, one to check for new emails, and one to send a message. You can then create three robot tasks, one for each piece, and combine them in a main routine.
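The decomposition above can be pictured with a small Python sketch. This is only an analogy, not IRobotSoft's own robot format: the `Session` class, the three task functions, and the `me@example.com` address are hypothetical names invented for illustration, and no real Web page is visited.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Session:
    """Hypothetical stand-in for the robot's browser state and variables."""
    logged_in: bool = False
    inbox: List[str] = field(default_factory=list)
    sent: List[Tuple[str, str]] = field(default_factory=list)


def logon(session: Session) -> None:
    # Task 1: a real robot would fill in and submit the logon form here.
    session.logged_in = True


def check_new_emails(session: Session) -> List[str]:
    # Task 2: keep the extracted content in a variable and return it.
    return list(session.inbox)


def send_message(session: Session, to: str, body: str) -> None:
    # Task 3: record the outgoing message instead of really sending it.
    session.sent.append((to, body))


def main(session: Session) -> List[str]:
    # Main routine: combine the three independently testable tasks.
    logon(session)
    new_emails = check_new_emails(session)
    send_message(session, "me@example.com", f"new emails: {len(new_emails)}")
    return new_emails
```

Because each task is a separate unit, it can be run and checked on its own before the main routine ties the three together.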

IRobot encourages modular robot design and provides structures to support it. When you design a complex Web robot, keep independent Web exploration sequences in separate robot tasks. This lets you design and test each task individually and reuse it conveniently. For example, within a single Gmail-processing robot you may want to call the Gmail logon task from either the email-checking task or the message-sending task. IRobot also allows you to reuse robot tasks across multiple robots. For example, suppose you have a stock-monitoring robot that automatically detects changes in certain stock prices, and you want it to send a message to your Gmail account whenever a certain amount of change is detected. You can then reuse your Gmail-processing robot and call its message-sending task from the stock-monitoring robot.
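The cross-robot reuse described here can be sketched in Python as follows. Again, this is an illustration under assumptions rather than IRobot's actual mechanism: `GmailRobot`, `monitor_stock`, the price list, and the threshold are all hypothetical, and the "sending" only appends to a list.

```python
from typing import Callable, List


class GmailRobot:
    """Hypothetical Gmail-processing robot with two reusable tasks."""

    def __init__(self) -> None:
        self.logged_in = False
        self.sent: List[str] = []

    def logon(self) -> None:
        # A real robot would submit the logon form here.
        self.logged_in = True

    def send_message(self, body: str) -> None:
        # The message-sending task calls the logon task when needed.
        if not self.logged_in:
            self.logon()
        self.sent.append(body)


def monitor_stock(prices: List[float], threshold: float,
                  notify: Callable[[str], None]) -> int:
    """Hypothetical stock-monitoring robot: call notify for each large move."""
    alerts = 0
    for prev, curr in zip(prices, prices[1:]):
        change = curr - prev
        if abs(change) >= threshold:
            notify(f"Price moved {change:+.2f} to {curr:.2f}")
            alerts += 1
    return alerts


# The stock robot reuses the Gmail robot's message-sending task.
gmail = GmailRobot()
alert_count = monitor_stock([100.0, 100.5, 103.0, 102.0], 2.0, gmail.send_message)
```

The stock-monitoring code knows nothing about logging on; it only receives a task to call, which is what keeps the two robots independent.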

With a good robot structure, you can easily scale up your robots to complete multiple tasks. The robots you create may show complex decision-making that looks like high-level machine intelligence, yet they carry out exactly the Web behaviors you need.


Copyright 2009 IRobotSoft All Rights Reserved.